surrogate gradient
Thank you for your valuable feedback, which is very helpful in improving the paper. We're encouraged by the broadly positive feedback, and greatly appreciate the critical and constructive suggestions.

- "Put this in the context of other work on computational homogenization / multi-scale finite element ...": Our method is related to these and to the boundary element method (BEM).
- "Limitation associated with micro-scale buckling... the coarse-grain behavior might exhibit hysteretic effects": Good ...
- "How sensitive is the outer optimization to the accuracy of the surrogate gradients?" ...
- "Do you know how the CES method scales with system size in terms of accuracy and evaluation time": In terms of ...
- "The method to solve the outer optimization over BCs to find minimum energy solutions to the composed surrogates ...": Free DoFs are optimized to minimize total predicted energy using L-BFGS.
- "The discussion of the surrogate and i.i.d. ..." ...
- "Are the BCs shared when a boundary is common between two cells": Yes. We have 1 DoF for each blue point in Fig. 2.
- "It's not clear how the HMC and PDE solver are used together": HMC is used to generate training BCs, preferring larger ... The PDE solver is used to compute the gradient of the pdf (which depends on E) w.r.t. the BCs. Given BCs, we run the solver to determine the internal u and E. We compute dE/dBC with the ... Then we use this to compute the gradient of the pdf w.r.t. the BCs, needed for the leapfrog step.
- "Does the HMC require a significant burn-in time before producing reasonable samples": No. Note that we don't truly care ... Per the appendix, HMC took between 3 and 100 leapfrog steps per sample.
- "The process of using the surrogates to solve the original problem can be explained in more detail": ...
- "Newton's method is neither the fastest nor the most stable... a comparison with more sophisticated methods would be ...": From a brief look, Liu et al.'s method appears tailored for ...
- Reviewer 5: "There is one outlier in L2 compression that was quite bad": We will discuss this in the main paper.
- "A comment might help the reader situate this work within the more usual (less idyllic) context of approximating ...": This is a good suggestion: we will relate to other work on learning energies.
Fine-Grained Iterative Adversarial Attacks with Limited Computation Budget
Hou, Zhichao, Gao, Weizhi, Liu, Xiaorui
This work tackles a critical challenge in AI safety research under limited compute: given a fixed computation budget, how can one maximize the strength of iterative adversarial attacks? Coarsely reducing the number of attack iterations lowers cost but substantially weakens effectiveness. To recover as much attack efficacy as possible within a constrained budget, we propose a fine-grained control mechanism that selectively recomputes layer activations at both the iteration-wise and layer-wise levels. Extensive experiments show that our method consistently outperforms existing baselines at equal cost. Moreover, when integrated into adversarial training, it attains comparable performance with only 30% of the original budget.

Adversarial attacks, which craft imperceptible perturbations to input data to degrade the performance of deep learning models, have become a central topic in the safety and robustness of AI systems. From the attack perspective, iterative adversarial methods such as Projected Gradient Descent (PGD) (Madry et al., 2017) are widely adopted as strong oracles to benchmark the robustness of modern neural networks.
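The PGD loop the abstract refers to can be sketched in a few lines. The loss-gradient oracle `grad_fn` below is a placeholder for a real model's input gradient, and the step sizes are illustrative defaults, not the paper's settings:

```python
import numpy as np

def pgd_attack(x, grad_fn, eps=0.03, alpha=0.01, steps=10):
    """L-infinity PGD: iterated signed-gradient ascent on the loss,
    projected back onto the eps-ball around the clean input x."""
    x_adv = x.copy()
    for _ in range(steps):
        g = grad_fn(x_adv)                        # dLoss/dx at current iterate
        x_adv = x_adv + alpha * np.sign(g)        # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep valid pixel range
    return x_adv

# Toy example: a linear "loss" whose input gradient is constant.
x = np.full(4, 0.5)
w = np.array([1.0, -1.0, 1.0, -1.0])
adv = pgd_attack(x, lambda z: w)
# Each coordinate saturates at the eps boundary: [0.53, 0.47, 0.53, 0.47]
```

Every iteration recomputes a full forward/backward pass, which is exactly the per-step cost the proposed method reduces by recomputing only selected layer activations.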
Adaptive Surrogate Gradients for Sequential Reinforcement Learning in Spiking Neural Networks
Berghe, Korneel Van den, Stroobants, Stein, Reddi, Vijay Janapa, de Croon, G. C. H. E.
Neuromorphic computing systems are set to revolutionize energy-constrained robotics by achieving orders-of-magnitude efficiency gains, while enabling native temporal processing. Spiking Neural Networks (SNNs) represent a promising algorithmic approach for these systems, yet their application to complex control tasks faces two critical challenges: (1) the non-differentiable nature of spiking neurons necessitates surrogate gradients with unclear optimization properties, and (2) the stateful dynamics of SNNs require training on sequences, which in reinforcement learning (RL) is hindered by limited sequence lengths during early training, preventing the network from bridging its warm-up period. We address these challenges by systematically analyzing surrogate gradient slope settings, showing that shallower slopes increase gradient magnitude in deeper layers but reduce alignment with true gradients. In supervised learning, we find no clear preference for fixed or scheduled slopes. The effect is much more pronounced in RL settings, where shallower or scheduled slopes lead to a 2.1x improvement in both training and final deployed performance. Next, we propose a novel training approach that leverages a privileged guiding policy to bootstrap the learning process, while still exploiting online environment interactions with the spiking policy. Combining our method with an adaptive slope schedule for a real-world drone position control task, we achieve an average return of 400 points, substantially outperforming prior techniques, including Behavioral Cloning and TD3BC, which achieve at most -200 points under the same conditions. This work advances both the theoretical understanding of surrogate gradient learning in SNNs and practical training methodologies for neuromorphic controllers demonstrated in real-world robotic systems.
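The slope effect the abstract analyzes can be illustrated with one common surrogate, the fast-sigmoid derivative 1/(1 + k|v|)^2. The choice of surrogate and the slope values here are generic illustrations, not the authors' exact settings; the point is that a shallower slope (smaller k) passes much larger gradients for membrane potentials far from threshold:

```python
import numpy as np

def fast_sigmoid_surrogate(v, slope):
    """Surrogate derivative for the non-differentiable Heaviside spike:
    d(spike)/dv ~= 1 / (1 + slope * |v|)^2, where v is the membrane
    potential measured relative to the firing threshold."""
    return 1.0 / (1.0 + slope * np.abs(v)) ** 2

v = np.linspace(-2.0, 2.0, 5)  # membrane potential minus threshold
shallow = fast_sigmoid_surrogate(v, slope=1.0)   # shallow slope
steep = fast_sigmoid_surrogate(v, slope=10.0)    # steep slope
# At threshold (v = 0) both equal 1, but far from threshold the shallow
# slope passes far more gradient, e.g. 1/9 vs 1/441 at |v| = 2.
```

In backprop-through-time this surrogate replaces the spike's zero-almost-everywhere derivative, which is why the slope directly controls how much gradient survives through deeper layers and longer sequences.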
Enforcing convex constraints in Graph Neural Networks
Rashwan, Ahmed, Briggs, Keith, Budd, Chris, Kreusser, Lisa
Many machine learning applications require outputs that satisfy complex, dynamic constraints. This task is particularly challenging in Graph Neural Network models due to the variable output sizes of graph-structured data. In this paper, we introduce ProjNet, a Graph Neural Network framework which satisfies input-dependent constraints. ProjNet combines a sparse vector clipping method with the Component-Averaged Dykstra (CAD) algorithm, an iterative scheme for solving the best-approximation problem. We establish a convergence result for CAD and develop a GPU-accelerated implementation capable of handling large-scale inputs efficiently. To enable end-to-end training, we introduce a surrogate gradient for CAD that is both computationally efficient and better suited for optimization than the exact gradient. We validate ProjNet on four classes of constrained optimization problems: linear programming, two classes of non-convex quadratic programs, and radio transmit power optimization, demonstrating its effectiveness across diverse problem settings.
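The best-approximation scheme that CAD builds on, Dykstra's algorithm, can be sketched for two convex sets. The halfspace and box below are illustrative constraint sets, not the paper's; unlike plain alternating projections, Dykstra's correction terms make the iterates converge to the *closest* point of the intersection:

```python
import numpy as np

def project_halfspace(x, a, b):
    """Euclidean projection of x onto the halfspace {y : a.y <= b}."""
    viol = a @ x - b
    return x if viol <= 0 else x - viol * a / (a @ a)

def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n."""
    return np.clip(x, lo, hi)

def dykstra(x0, projections, iters=100):
    """Dykstra's algorithm for the best-approximation problem:
    find the point of the intersection closest to x0."""
    x = x0.copy()
    p = [np.zeros_like(x0) for _ in projections]  # per-set corrections
    for _ in range(iters):
        for i, proj in enumerate(projections):
            y = proj(x + p[i])
            p[i] = x + p[i] - y  # update correction for set i
            x = y
    return x

# Closest point to (2, 2) in {x + y <= 1} intersected with [-1, 1]^2.
x0 = np.array([2.0, 2.0])
projs = [lambda z: project_halfspace(z, np.array([1.0, 1.0]), 1.0),
         lambda z: project_box(z, -1.0, 1.0)]
x_star = dykstra(x0, projs)  # converges to (0.5, 0.5)
```

CAD's contribution, per the abstract, is averaging such component projections so they parallelize on GPU, plus a surrogate gradient so the iterative scheme can sit inside an end-to-end trained GNN.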